Model selection and score normalization for text-dependent single utterance speaker verification

Authors

  • Osman BÜYÜK
  • Mustafa Levent ARSLAN
Abstract

In this paper, we investigate model selection and channel variability issues in a text-dependent single utterance (TDSU) speaker verification application. Because no appropriate database exists for the task, we collected a multichannel speaker recognition database consisting of multiple recordings of a single Turkish utterance. The first set of experiments is devoted to model selection. Phonetic hidden Markov model (HMM)-based, sentence HMM-based, and Gaussian mixture model (GMM)-based methods are compared to find the most appropriate modeling approach for the target application. In the experimental results, the HMM-based methods outperform the GMM, and the sentence HMM yields the best performance among the 3 approaches. In the second set of experiments, we implement various score normalization techniques to compensate for channel mismatch conditions. Test normalization, zero normalization, and their combinations are investigated for the TDSU task. We propose a novel combination procedure named combined normalization (C-norm). We also use prior knowledge of the handset-channel type to improve the verification performance. In addition to the GMM-based method, a cohort-based channel detection procedure is presented to identify enrollment/authentication channels. Among the score normalization methods, handset-dependent C-norm yields the best performance, with a 0.72% equal error rate (EER) in the ideal channel-known case and a 0.74% EER when the GMM-based and cohort-based systems are combined for channel detection.
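The abstract names zero normalization and test normalization and reports results as an EER, but it gives no formulas. The following is a minimal sketch assuming the standard textbook definitions: Z-norm standardizes a raw score with statistics of the claimed speaker model's scores on impostor utterances, T-norm standardizes it with statistics of the test utterance's scores against a set of cohort models, and the EER is the operating point where the false acceptance and false rejection rates meet. All function names and toy scores are hypothetical, and the paper's C-norm combination is not reproduced here because the abstract does not specify it.

```python
from statistics import mean, pstdev


def _standardize(raw_score, reference_scores):
    """Shift and scale a score by the mean and std of a reference score set."""
    mu = mean(reference_scores)
    sigma = pstdev(reference_scores)
    if sigma == 0:
        raise ValueError("reference scores have zero variance")
    return (raw_score - mu) / sigma


def z_norm(raw_score, model_vs_impostor_scores):
    """Z-norm (assumed standard form): the reference set holds the claimed
    speaker model's scores on impostor utterances, computed offline."""
    return _standardize(raw_score, model_vs_impostor_scores)


def t_norm(raw_score, utterance_vs_cohort_scores):
    """T-norm (assumed standard form): the reference set holds the test
    utterance's scores against cohort speaker models, computed at test time."""
    return _standardize(raw_score, utterance_vs_cohort_scores)


def equal_error_rate(target_scores, impostor_scores):
    """Approximate the EER by sweeping the threshold over all observed scores
    and keeping the point where false acceptance and false rejection are closest."""
    best_gap, eer = None, None
    for threshold in sorted(set(target_scores) | set(impostor_scores)):
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


if __name__ == "__main__":
    # Toy numbers for illustration only; they are not results from the paper.
    print(z_norm(2.0, [0.0, 2.0]))
    print(t_norm(2.0, [0.0, 2.0]))
    print(equal_error_rate([2.0, 3.0, 4.0, 1.0], [0.0, 0.5, 1.5, 0.2]))
```

A handset-dependent variant, as mentioned in the abstract, would presumably keep separate reference score sets per handset or channel type and pick the set that matches the detected channel; that selection step is left out of this sketch.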


Similar resources

Telephone-based Text-dependent Speaker Verification

In this thesis, we investigate model selection and channel variability issues on telephone-based text-dependent speaker verification applications. Due to the lack of an appropriate database for the task, we collected two multi-channel speaker recognition databases which are referred to as text-dependent variable text (TDVT-D) and text-dependent ...


Score-level compensation of extreme speech duration variability in speaker verification

In this work we aim at compensating for the degrading effects of utterance length variability on speaker verification systems, which appear in many typical applications such as forensics. The paper concentrates on the score misalignments due to different utterance lengths, proposing several algorithms for their normalization. In order to test the proposed methods, we have built two corpora from NIST ...


Successive cohort selection (SCS) for text-independent speaker verification

A novel cohort selection method, namely successive cohort selection (SCS), is presented in this paper for text-independent speaker verification. The proposed method computes the distance between two models directly and selects each new cohort member based on both the claimed speaker model and the existing cohort members. In addition to this new cohort selection method, we also propose a new score mea...


Text-independent speaker verification using utterance level scoring and covariance modeling

This paper describes a computationally simple method to perform text-independent speaker verification using second-order statistics. The suggested method, called utterance level scoring (ULS), yields a normalized score in a single pass through the frames of the tested utterance. The utterance sample covariance is first calculated and then compared to the speaker covariance using a ...


Robust speaker verification insensitive to session-dependent utterance variation and handset-dependent distortion

This paper investigates a method of creating robust speaker models that are not sensitive to session-dependent (SD) utterance variation and handset-dependent (HD) distortion for hidden Markov model (HMM)-based speaker verification systems in a real telephone network. We recently reported a method of creating session-independent (SI) speaker-HMMs that are not sensitive to SD utterance variation. ...




Publication date: 2012